** Data reading and variable selection from raw data
** UK National Survey of Sexual Attitudes and Lifestyle 2000-2001

** 01. Reading data **

cap log close
clear all
cd /*insert your working directory here*/
use /*read your data here*/ 
set more off
numlabel, add


** 02. Constructing year and country variables **

ge year=2000
lab var year "survey year"

ge country=826
lab var country "ISO country code"
// UK: 826 (ISO 3166-1 numeric code)


** 03. ID variables **

cap drop pid  /* remove any existing pid; capture avoids an error if it is absent */
ge pid=sserial
lab var pid "person id"
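
* Optional check (a sketch, assuming sserial is meant to identify exactly
* one respondent per row): report any duplicated person ids.
duplicates report pid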


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=rsex
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge age=dage
lab var age "age"  /* at interview */

ge birthyr=dateyoi-age
lab var birthyr "year of birth"
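
* Note: interview year minus age can be one year later than the true birth
* year when the interview took place before that year's birthday.
* Optional check (a sketch): look at the interview years actually present
* and count the cases left without a birth year.
tab dateyoi, m
count if missing(birthyr)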


** 05. Siblings **

ge nsibs=nosibs  /* incl. step-siblings, excl. respondent */
recode nsibs (98/99=.)  /* recode 98 & 99 to missing */
lab var nsibs "number of siblings"

* birth order: only rough position is available
ge birthorder=sibspos
recode birthorder (-1 1=1) (3=2) (2=3) (9=.)
lab var birthorder "birth order (3 cats)"
lab def birthorder 1 "oldest" 2 "in between" 3 "youngest"
lab val birthorder birthorder
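
* Optional check (a sketch): compare the raw items with the recoded ones
* to confirm that the recodes above did what was intended.
tab nosibs if missing(nsibs), m
tab sibspos birthorder, m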


** 06. Own education **

* free up the names educ and educ2 so the raw exam variables can take them
drop educ educ2

rename exams educ2
rename exams2 educ

lab var educ "highest educ qualification (4 cats)"
lab var educ2 "highest educ qualification (detailed)"
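
* Optional check (a sketch, assuming exams2 is a collapsed version of
* exams): each detailed category should fall into a single broad category.
tab educ2 educ, m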


** 07. Parents' education: Father and/or Mother **

// No parents' education available


** 08. Own occupation **

rename rsc class
rename rseg class2

lab var class "R's own social class"
lab var class2 "R's own social class (detailed)"


** 09. Parents' occupation **

rename par1occ paocc
rename par2occ paocc2
rename par3occ paocc3

lab var paocc "parents occupation at 16"
lab var paocc2 "parents employee? employer?"
lab var paocc3 "parents manager?"

rename parsc paclass
lab var paclass "parent(s) social class"


** 10. Tabulate the Identified Variables **

log using /*insert your log file path here*/, replace text

** Data reading and variable selection from raw data
** UK National Survey of Sexual Attitudes and Lifestyle 2000-2001


** Sex **
tab sex,m

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs birthorder, d


** R's Own Education & Occupation **
tab1 educ educ2 class class2 ,m

** Parental Occupation **
tab1 paocc* paclass,m

log close


** 11. Keep the identified variables only **

keep year country pid ///
	 sex age birthyr ///
	 nsibs birthorder ///
	 educ educ2 class* paclass paocc* ///
	 psu final_wt strata
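
* Optional (a sketch, not part of the original workflow): declare the
* complex survey design, assuming psu is the primary sampling unit,
* strata the stratum identifier, and final_wt a probability weight.
* The settings are stored with the data when it is saved.
svyset psu [pweight=final_wt], strata(strata)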
	 
	 
** 12. Create educational years variable **

* rename the detailed qualification variable to educ_cat
rename educ2 educ_cat

ge educ_yrs = .
replace educ_yrs = 8.433 if educ_cat == -1
replace educ_yrs = 13.737 if educ_cat == 1
replace educ_yrs = 12.211 if educ_cat == 2
replace educ_yrs = 12.211 if educ_cat == 3
replace educ_yrs = 12.035 if educ_cat == 4
replace educ_yrs = 11.280 if educ_cat == 5
replace educ_yrs = 11.280 if educ_cat == 6
replace educ_yrs = 11.280 if educ_cat == 7
replace educ_yrs = 11 if educ_cat == 8
replace educ_yrs = 11 if educ_cat == 9
replace educ_yrs = 10.625 if educ_cat == 10
replace educ_yrs = 10.625 if educ_cat == 11
replace educ_yrs = 10.625 if educ_cat == 12
replace educ_yrs = 10 if educ_cat == 13
replace educ_yrs = 10 if educ_cat == 14
replace educ_yrs = 10 if educ_cat == 15
replace educ_yrs = 11.333 if educ_cat == 16
replace educ_yrs = 8.381 if educ_cat == 17
replace educ_yrs = 8.381 if educ_cat == 99

lab var educ_yrs "Respondent's years of education"


** 13. Save the Data File **

saveold /*insert your output file path here*/, replace
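
* Optional final check (a sketch): list any qualification codes that the
* years-of-education mapping left without a value, then summarize the
* assigned years within each category.
tab educ_cat if missing(educ_yrs), m
tabstat educ_yrs, by(educ_cat) stat(n mean)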
